An extension 2DPCA based visual feature extraction method for audio-visual speech recognition
Authors
Abstract
Two dimensional principal component analysis (2DPCA) has been proposed for face recognition as an alternative to the traditional PCA transform [1]. In this paper, we extend this approach to visual feature extraction for audio-visual speech recognition (AVSR). First, a two-stage 2DPCA transform is conducted to extract the visual features. Then, visemic linear discriminant analysis (LDA) is applied as a post-extraction processing step. We compare the presented method with traditional PCA and 2DPCA. Experimental results show that the extended 2DPCA reduces the feature dimension of 2DPCA and represents the test mouth images better than PCA does. Moreover, 2DPCA+LDA requires less computation and performs better than PCA+LDA in visual-only speech recognition. Finally, further experiments demonstrate that our AVSR system using the extended 2DPCA method is significantly more robust in noisy environments than audio-only speech recognition.
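To make the described pipeline concrete, the following is a minimal sketch of a two-stage (row- and column-direction) 2DPCA projection of mouth-region images followed by an LDA post-processing step, assuming grayscale mouth ROIs stacked in a NumPy array. The function names, image sizes, numbers of retained axes, and the use of scikit-learn's LinearDiscriminantAnalysis with hypothetical viseme labels are illustrative assumptions, not the paper's implementation.

# Sketch: two-stage 2DPCA feature extraction + LDA post-processing.
# All names, dimensions, and the sklearn LDA stand-in for "visemic LDA"
# are assumptions for illustration.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def fit_two_stage_2dpca(images, n_row_axes=8, n_col_axes=8):
    """Learn row- and column-direction projection axes from training
    images of shape (num_images, rows, cols)."""
    mean_img = images.mean(axis=0)
    centered = images - mean_img
    # Column-direction image covariance: mean of (A - Abar)^T (A - Abar), shape (cols, cols).
    g_col = np.einsum('krc,krd->cd', centered, centered) / len(images)
    # Row-direction image covariance: mean of (A - Abar)(A - Abar)^T, shape (rows, rows).
    g_row = np.einsum('krc,ksc->rs', centered, centered) / len(images)
    # Keep the leading eigenvectors of each covariance as projection axes.
    _, v_col = np.linalg.eigh(g_col)
    _, v_row = np.linalg.eigh(g_row)
    x_axes = v_col[:, ::-1][:, :n_col_axes]   # (cols, d)
    z_axes = v_row[:, ::-1][:, :n_row_axes]   # (rows, q)
    return mean_img, z_axes, x_axes

def project_two_stage_2dpca(images, mean_img, z_axes, x_axes):
    """Project each image A to C = Z^T (A - Abar) X and flatten to a vector."""
    centered = images - mean_img
    feats = np.einsum('rq,krc,cd->kqd', z_axes, centered, x_axes)
    return feats.reshape(len(images), -1)

if __name__ == '__main__':
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for mouth ROI frames and per-frame viseme labels.
    train_imgs = rng.random((200, 32, 48))
    viseme_labels = rng.integers(0, 10, size=200)

    mean_img, z_axes, x_axes = fit_two_stage_2dpca(train_imgs)
    train_feats = project_two_stage_2dpca(train_imgs, mean_img, z_axes, x_axes)

    # Post-extraction LDA, fitted on the (hypothetical) viseme class labels.
    lda = LinearDiscriminantAnalysis()
    visual_features = lda.fit_transform(train_feats, viseme_labels)
    print(visual_features.shape)   # (200, num_viseme_classes - 1)

The two covariance matrices are only (cols x cols) and (rows x rows), which is why this kind of bidirectional projection can shrink the feature dimension relative to plain 2DPCA before the LDA stage.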
Similar resources
Audio-Visual Speech Recognition for a Person with Severe Hearing Loss Using Deep Canonical Correlation Analysis
Recently, we proposed an audio-visual speech recognition system based on a neural network for a person with an articulation disorder resulting from severe hearing loss. In the case of a person with this type of articulation disorder, the speech style is quite different from that of people without hearing loss, making a speaker-independent acoustic model for unimpaired persons more or less usele...
Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss
In this paper, we propose an audio-visual speech recognition system for a person with an articulation disorder resulting from severe hearing loss. In the case of a person with this type of articulation disorder, the speech style is so different from that of people without hearing loss that a speaker-independent acoustic model for unimpaired persons is hardly useful for recognizing it. The a...
Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss
In this paper, we propose an audio-visual speech recognition system for a person with an articulation disorder resulting from severe hearing loss. In the case of a person with this type of articulation disorder, the speech style is so different from that of people without hearing loss that a speaker-independent model for unimpaired persons is hardly useful for recognizing it....
An Audio-visual Speech Recognition System for Testing New Audio-visual Databases
For the past several decades, visual speech signal processing has been an attractive research topic for overcoming certain audio-only recognition problems. In recent years, many automatic speech-reading systems that combine audio and visual speech features have been proposed. For all such systems, the objective of these audio-visual speech recognizers is to improve recognition accuracy, parti...
Machine learning based Visual Evoked Potential (VEP) Signals Recognition
Introduction: Visual evoked potentials contain certain diagnostic information which has proved to be important in assessing the functional integrity of the visual system. Due to the substantial decrease of amplitude under extramacular stimulation in commonly used pattern VEPs, differentiating normal and abnormal signals can prove to be quite an obstacle. Due to developments in the use of machine l...
Publication date: 2007